N=1:

rank | frequency | n-gram |
---|---|---|
1 | 18922 | -a |
2 | 10662 | -e |
3 | 8272 | -i |
4 | 5589 | -o |
5 | 2095 | -u |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 4612 | -la |
2 | 3353 | -le |
3 | 2736 | -ni |
4 | 2564 | -wa |
5 | 1708 | -we |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 2162 | -ela |
2 | 1385 | -ele |
3 | 1299 | -eni |
4 | 1161 | -isa |
5 | 1132 | -ile |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 546 | -anga |
2 | 492 | -lela |
3 | 412 | -elwa |
4 | 400 | -weni |
5 | 332 | -iswa |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 208 | -ekile |
2 | 184 | -langa |
3 | 168 | -ileyo |
4 | 159 | -elela |
5 | 129 | -andla |

The tables show the most frequent letter N-grams at the ends of words for N=1…5, one table per value of N. The analysis parallels section 2.2.5, Most frequent word beginnings, but the aim here is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
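The same counting can be sketched outside the database for any N. The Python below is only an illustration of the grouping and ranking performed by the query above, not the code used to produce the tables. The function name `top_endings`, the list `toy_words`, and the alphabetical tie-break are assumptions made for this example, and the toy words are not taken from the corpus.

```python
from collections import Counter

def top_endings(words, n, limit=5):
    """Return (rank, frequency, ending) rows for the most frequent n-letter endings.

    `words` is assumed to hold one entry per row of the `words` table.
    """
    counts = Counter()
    for word in words:
        # Like RIGHT(word, n) in MySQL, word[-n:] yields the whole word
        # when the word is shorter than n letters (n >= 1 is assumed).
        counts["-" + word[-n:]] += 1
    # Sort by descending frequency; ties are broken alphabetically here,
    # which is an assumption (the SQL query leaves tie order unspecified).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, freq, ending)
            for rank, (ending, freq) in enumerate(ranked[:limit], start=1)]

# Hypothetical toy input, used only to exercise the function.
toy_words = ["bala", "bela", "bona", "funda", "hamba", "hambile", "ilanga"]

for n in range(1, 6):
    print(n, top_endings(toy_words, n))
```

As in the query, tied endings receive consecutive ranks rather than a shared rank.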
2.2.5 Most frequent word beginnings